Compiling for Increasing On-chip Parallelism
Abstract
Microprocessor companies are adding more and more parallelism to each chip to increase per-chip performance. At the fine-grained level, vector instruction sets are added; at the coarse-grained level, multiple cores are placed on the same chip. This trend poses a challenge for application developers as well as for compiler developers: how can the power of this newly introduced parallelism be exploited? In this paper, we present a source-to-source compiler that automatically compiles programs written by ordinary users to target on-chip parallelism, without requiring users to specify parallelism directives. Initially developed for short vector processors, the compiler has been extended to support the heterogeneous multi-core CELL processor. Besides parallelism, these processors also introduce various memory constraints, such as data alignment and data movement, that affect an application's performance. We therefore also discuss our compiler strategies for these issues.
Keywords: On-chip Parallelism, Vectorization, Parallelization, Heterogeneous Multi-core, Memory Hierarchy Performance, Data Alignment, Data Movement, Vector Data Reuse
Similar resources
Reliability and Performance Evaluation of Fault-aware Routing Methods for Network-on-Chip Architectures (RESEARCH NOTE)
Nowadays, faults and failures are increasing especially in complex systems such as Network-on-Chip (NoC) based Systems-on-a-Chip due to the increasing susceptibility and decreasing feature sizes. On the other hand, fault-tolerant routing algorithms have an evident effect on tolerating permanent faults and improving the reliability of a Network-on-Chip based system. This paper presents reliabili...
Pull based Migration of Real-Time Tasks in Multi-Core Processors
1. Problem Description The complexity of uniprocessor design attempting to extract instruction level parallelism has pushed the computer architects to leverage parallelism through multiple simple cores on a single chip. Also, with continuous advancement in chip technology chip multiprocessors (CMP) have become a reality. Multicores are becoming ubiquitous, not only in general-purpose but also e...
Pull based Migration of Real-Time Tasks in Multi-Core Processors
1. Problem Description The complexity of uniprocessor design attempting to extract instruction level parallelism has motivated the computer architects to leverage parallelism through multiple simple cores on a single chip. Also, with continuous advancement in chip technology chip multiprocessors (CMP) have become a reality. Multicores are becoming ubiquitous, not only in general-purpose but als...
Effective Instruction Prefetching In Chip Multiprocessors
threaded application performance, often achieved through instruction level parallelism per chip is increasing, the software and hardware techniques to exploit the potential of studies mostly involve distributed shared memory multiprocessors and fetching will not be fully effective at masking the remote fetch latency. the effective address of the load instructions along that path based upon a hi...
CASH: Revisiting Hardware Sharing in Single-Chip Parallel Processors
As the increasing of issue width has diminishing returns with superscalar processor, thread parallelism with a single chip is becoming a reality. In the past few years, both SMT (Simultaneous MultiThreading) and CMP (Chip MultiProcessor) approaches were first investigated by academics and are now implemented by the industry. In some sense, CMP and SMT represent two extreme design points. In thi...
Publication date: 2006